ABSTRACT

New Hampshire has adopted a standards-based statewide
assessment, the New Hampshire Educational Improvement and Assessment Program
(NHEIAP), which is designed to measure students' learning against proficiency
standards at grades 3, 6, and 10. Because of the difficulty teachers had in 
interpreting the NHEIAP results, a custom-designed software program, the 
NHEIAP Data Interpreter, was developed to be a user-friendly, transparent 
means of examining district- or building-level groups of students. The NHEIAP 
Data Interpreter is a Windows program that displays the data for various 
groups in raw form or in frequency charts. The power of the program is in the 
subtlety with which it can select and compare groups of students, allowing 
educators to focus on the meaning of the data. Copies of the program on 
CD-ROM are available from the paper's authors. A sample NHEIAP mathematics 
problem is attached. (Contains six tables.) (SLD) 

Helping Teachers Interpret Item-Level Data: The New Hampshire Statewide Assessment

Nancy R. Cook, Ph.D., Notre Dame College, Manchester, NH

Robert A. Smith, Ph.D., TRIERE Research, Deerfield, NH

Abstract

This paper is designed to accompany a technology-based roundtable at AERA (Montreal,
1999). The session demonstrates an original software program designed to help teachers,
administrators, and other education policy-makers understand the item-level data
provided by the NH Statewide Assessment (NHEIAP). This standards-based and mixed-format
assessment is administered to all students in New Hampshire at the end of Grades
3, 6, and 10. The program includes data from each grade-level group (n=13,000 to
16,000). The NHEIAP Data Interpreter, a custom-designed software program, provides a
user-friendly, transparent means of examining district and/or building level groups of
students. Copies of the program are available at minimal cost from the authors.

Introduction 

Like many states, New Hampshire adopted a standards-based state-wide assessment in
this decade. The New Hampshire Educational Improvement and Assessment Program
(NHEIAP) resulted in the composition of K-12 curriculum frameworks in four core
academic content areas: mathematics, English language arts, science, and social studies.

The new state-wide assessment was designed to measure students’ learning against
proficiency standards at three tested grade levels (3, 6, and 10). The assessment included
multiple-choice items and open-response items (and, additionally, a longer writing sample
as part of the English language arts assessment). The purported goal of the state-wide
assessment was to help schools and districts improve student learning relative to the
standards. Four score levels were defined (Novice, Basic, Proficient, and Advanced), with
cut-points determined primarily on the basis of performance on the open-ended
items.



Although the assessment was intended to help school districts move local curriculum into
alignment with the standards in the Curriculum Frameworks, a number of problems stood in
the way. First, New Hampshire had a sporadic history of state-wide
assessment. Up to the present year (1999), 90% of school funding was derived from local
property taxes, and school finances consequently varied widely: communities in
New Hampshire range from very affluent small towns and suburbs in
the southern tier of the state (near Massachusetts) to very small, poor, rural northern
locales (near Vermont and Maine).



A second problem was that past state-wide assessments tended to be nationally normed
achievement tests (e.g., the California Achievement Test or the Stanford). On such
measures, as well as on the SAT, New Hampshire students did very well. Since the
majority of these students are relatively affluent and white, such a standing is easily
understood. However, in many districts, these scores promoted the notion that NH
schools were functioning well.

The third problem with meeting the goal of school improvement arose when the first
assessment was administered (in 1995, to Grade 3
only). The Curriculum Frameworks had limited distribution at that point, and the format
of the test (especially the open-ended items) was unfamiliar, at best, to NH educators. As
a consequence, scores on these first NHEIAP administrations were fairly low, with a majority of
students at the Basic and Novice levels in many districts and buildings. The poor scores
were widely disseminated through the media (including the well-known Union Leader
newspaper). The low scores, as well as the publicity, seemed to create a negative
response among educators to the entire assessment and to slow the reform efforts the
assessment was meant to promote.

The final problem that interfered with using the state-wide assessment data
to reform curriculum and instruction was the form in which data were sent back to
local districts. Hard copies of students’ responses to various items, along with state-wide
percent correct, were the most understandable forms, but they required arduous amounts of
time to translate into meaningful guides for action. Because of these problems, the NHEIAP
Data Interpreter was developed. This custom-made software program allows decision-makers
at the district and building levels to quickly identify those items (and the related
standards) where local curricula and instructional practices are working well, need
minor attention, or need major improvement (such as curricular alignment).

The NHEIAP Data Interpreter 

The NHEIAP Interpreter is an IBM/PC Windows program that displays NHEIAP data
from various groups of New Hampshire students. Data can be displayed in raw form (for
individual students) or in frequency charts showing the percentage of students
giving each possible response to any desired question. The real power of the program
lies in the subtlety with which it can select and compare different groups of students.
Selection is a three-tier process. Initially the operator specifies a demographic group,
including the grade tested and the year, followed by the district and/or school. The user
can choose one school or all the schools in one district. It is also possible to choose all
districts in NH, but this takes a fairly long time to compile, as
approximately 13,000 to 16,000 students are represented. The sample is now well enough
defined that the user can choose either a raw data box (“Show Data”) or an analysis box
(“Analyze Data”), showing data or percentages for the entire sample.
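
The selection and frequency display described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not the Interpreter's actual code: the
record layout, the names `select_group` and `response_percentages`, and all of the data
are hypothetical.

```python
from collections import Counter

# Hypothetical record layout: (grade, year, district, school, response to one item).
RECORDS = [
    (10, 1998, "District A", "School 1", "a"),
    (10, 1998, "District A", "School 1", "d"),
    (10, 1998, "District A", "School 1", "d"),
    (10, 1998, "District A", "School 2", "b"),
    (10, 1998, "District B", "School 3", "d"),
    (6, 1998, "District A", "School 4", "c"),
]


def select_group(records, grade, year, district=None, school=None):
    """Narrow the sample by grade and year, then optionally by district and school."""
    selected = []
    for rec_grade, rec_year, rec_district, rec_school, response in records:
        if (rec_grade, rec_year) != (grade, year):
            continue
        if district is not None and rec_district != district:
            continue
        if school is not None and rec_school != school:
            continue
        selected.append(response)
    return selected


def response_percentages(responses):
    """Percentage of the selected group giving each response."""
    if not responses:
        return {}
    counts = Counter(responses)
    return {option: round(100 * n / len(responses), 1)
            for option, n in sorted(counts.items())}


one_school = select_group(RECORDS, 10, 1998, "District A", "School 1")
print(response_percentages(one_school))
```

Leaving `district` and `school` unset corresponds to the slower state-wide selection
mentioned above, since every record for that grade and year must then be scanned.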

After choosing the analysis box, the user must define two more parameters: content area
(mathematics, English language arts, science, or social studies) and format (for example,
multiple choice). Table 1 displays the initial results in the analysis box after math
multiple choice has been chosen for Monadnock Regional High School. All the results in
this paper are from the 1998 Grade 10 cohort at Monadnock, which is located in the
southwest corner of New Hampshire. The school enrolls approximately 1300 students in Grades 7-12.

The school district is primarily rural, in the top third of NH districts for educational need
indices (e.g., free lunch eligibility), with the largest proportion of adults employed in
blue-collar jobs. The first author recently spent two hours with the curriculum leaders at the
school to explore the meaning of data generated by the Interpreter.



Table 1 - Math Multiple Choice - Monadnock Regional High School - all students 
(n=178) 

Above this table are options for further disaggregation of various student groups. The
first options allow the selection of students at one or more score levels. For example, a
user might choose only the “novice” students or might combine “proficient” and
“advanced” level students. (We typically recommend that combination since there are
very few advanced students in any building/district group.) The second set of options
appears on the second line, below the score level options. This line allows separation of
students with and without disabilities (“coded” or “non-coded”). These options can also
be mixed with the score level options, for example “novices who are coded” as opposed
to “basic who are coded.” Checking the appropriate boxes (and then “redisplaying”)
quickly displays the item results.
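
The two rows of options amount to a pair of filters that can be combined. The sketch
below illustrates that logic under stated assumptions: the `Student` record, the
`disaggregate` function, and the sample data are all hypothetical and are not taken
from the Interpreter itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    score_level: str  # "novice", "basic", "proficient", or "advanced"
    coded: bool       # True for students with an identified disability
    response: str     # option chosen on the item being examined


def disaggregate(students, levels=None, coded=None):
    """Keep students matching the chosen score levels and coded status.

    Passing None for either argument leaves that option unchecked (no filtering).
    """
    return [s for s in students
            if (levels is None or s.score_level in levels)
            and (coded is None or s.coded == coded)]


# Hypothetical sample, for illustration only.
SAMPLE = [
    Student("novice", True, "a"),
    Student("novice", False, "b"),
    Student("basic", True, "d"),
    Student("proficient", False, "d"),
    Student("advanced", False, "d"),
]

upper = disaggregate(SAMPLE, levels={"proficient", "advanced"})
coded_novices = disaggregate(SAMPLE, levels={"novice"}, coded=True)
print(len(upper), len(coded_novices))
```

Combining “proficient” and “advanced” into one set mirrors the recommendation above for
groups in which very few students reach the advanced level.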

The open-ended items on the NHEIAP are scored on a four-point scale using
item-specific rubrics, which are released with the items. Table 2 displays the results for the
same building group in mathematics on the four open-response items.



Table 2 - Math Open-Response Items - Monadnock Regional High School - all students 
(N=178) 

As with the multiple choice items, the same disaggregation options are available.

Changing Curriculum and Instruction

Once the user has chosen the sub-groups for analysis, he or she can examine the performance of
these students to ascertain areas of positive practice, areas needing minor change, and
those needing major curriculum efforts. The first way to do this is by identifying items
in three difficulty-based groups: [1] “floor” items, where 80% or more of the specified
group answer correctly (or score 3-4 points on the open-ended items); [2] “ceiling” items, where
less than 40% of the specified group answer correctly (or 80% score less than 2 on the
open-ended items); and [3] items in the group’s “zone of proximal development,” those
where the percent correct ranges between 40% and 80% (or the majority of students score 2-3
on the open-ended items). The use of the Vygotskian term of proximal development
reinforces the idea that instruction should be grounded in what the group already can do
(the floor items representing those skills) and should provide scaffolding until the group
finds the skills represented in the zone to be “easy.” Those skills represented by the
ceiling items usually reflect areas where curricular decisions must be addressed as well.
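
For the multiple-choice items, the three-way grouping reduces to two cut-points. The
sketch below is a minimal illustration, assuming that exactly 80% counts as a floor item
and exactly 40% falls in the zone, which is how we read “80% or more” and “less than
40%.” The function names and the percentages are hypothetical, and the open-ended
scoring rules are not covered.

```python
def classify_item(percent_correct):
    """Place one multiple-choice item in a difficulty group for the selected students."""
    if not 0 <= percent_correct <= 100:
        raise ValueError("percent_correct must be between 0 and 100")
    if percent_correct >= 80:
        return "floor"      # skills the group already has
    if percent_correct < 40:
        return "ceiling"    # likely curricular gaps
    return "zone of proximal development"


def group_items(percent_correct_by_item):
    """Sort item numbers into the three groups, keeping their original order."""
    groups = {"floor": [], "zone of proximal development": [], "ceiling": []}
    for item, pct in percent_correct_by_item.items():
        groups[classify_item(pct)].append(item)
    return groups


# Hypothetical percent-correct values keyed by item number.
hypothetical = {1: 91.0, 2: 62.5, 3: 80.0, 4: 35.0, 5: 40.0}
print(group_items(hypothetical))
```

Running the same grouping separately for each score-level subgroup shows which items
move from one group to another as performance level changes.
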
An interesting example of a ceiling item was found in math at Monadnock Regional
High School.



Table 3 - Sample ceiling item - Monadnock Regional High School (N=178)

#23. Jon wants his friends to guess what coins he has in his pockets. He gives them some
clues:

He has 10 coins.

The coins are worth $1.60.

He has some nickels, some dimes and some quarters.

Al guesses that the coins are five quarters, three dimes, and one nickel. Al’s guess is
wrong because:

a. the value is less than $1.60.

b. the value is more than $1.60.

c. he used more than 10 coins.

d. he used less than 10 coins.



Results of sub-groups 
Math item 23 
• Group too small to be a reliable indicator.



As a result of this pattern, the mathematics department discovered there were few
opportunities in Monadnock’s Grades 7-10 curriculum to address such “logic” problems. The
math curriculum for lower level students tended to emphasize computation. The
algebra/geometry cycle began for upper level students in Grade 8 and was fairly
traditional in its focus.

Another problem in Monadnock’s math curriculum was visible from the performance of
subgroups on the open-ended items. Open-ended item 1 (labeled #18 in the attached
sheet) required students to find the area of an isosceles triangle, draw a triangle of the
same area that was not congruent with the first, and then find the area of another right
triangle. All elements required students to explain their work. This item shows
dramatically different scores for students at the four basic score levels.

As can be seen from these data, students at the “proficient” level did far better at accurately
completing the three tasks (represented by 2-3 point scores) and explaining their work
(3-4 points).

Dramatic differences between score groups can also provide information about areas 
where curricular changes are needed. A good example of such an item appeared in the 
English Language Arts items for Monadnock: 



Table 5 - English Language Arts Item 15

15. (Referring to a passage on biological classification schemes) What was the main 
problem with Aristotle’s classification system? 

a. He divided organisms into plants and animals. 

b. He used the term “species” to mean similar life forms. 

c. He chose to base his system on natural habitats. (Correct response) 

d. He focused on similarities and differences in grouping organisms. 



Sub-group responses 

In this case, English and science teachers agreed that lower track students were required
to read far less non-fiction (in both science and English) than were their higher level
peers. This led to questions about reading across the curriculum and content area
teachers’ responsibilities for reading skills.

A final method of identifying areas for improvement through the Data Interpreter is
examination of error patterns. With a “guessing” pattern, equivalent
percentages of the group choose two or three options, which usually indicates a need for
curriculum alignment. With “distractor pulls,” a large proportion of students choose one
incorrect response. This usually indicates “naive understanding” and a need for more
instructional time allocated to the standard associated with the item. A very interesting
item with both a strong difference between groups and a distractor pull error pattern
appeared in the Social Studies multiple choice items in Monadnock.



Table 6 - Social Studies Item 5 

A leader of the Russian Revolution who became the first head of the Soviet Union was 

a. Gorbachev 

b. Nicholas II 

c. Lenin (Correct response) 

d. Trotsky 



Sub-group responses 

Again, the pattern here was explained by the differentiated social studies curriculum for
upper and lower track students, wherein the higher track students (where the majority
were “proficient”) were taught World History before the 10th-grade assessment was
administered.
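
The two error patterns described earlier in this section can be told apart mechanically
once the response percentages are known. The paper gives no numeric cut-offs for a
“large proportion” or for “equivalent percentages,” so the `pull_threshold` and `spread`
values below are assumptions chosen only for illustration, as are the function name and
the sample percentages.

```python
def error_pattern(percentages, correct, pull_threshold=30.0, spread=10.0):
    """Label the error pattern for one multiple-choice item.

    percentages maps each option to the percent of the group choosing it.
    pull_threshold and spread are assumed cut-offs, not values from the paper.
    """
    wrong = sorted((pct for option, pct in percentages.items() if option != correct),
                   reverse=True)
    if not wrong:
        return "no clear pattern"
    top_wrong = wrong[0]
    next_wrong = wrong[1] if len(wrong) > 1 else 0.0
    # Distractor pull: one incorrect option stands well apart from the others.
    if top_wrong >= pull_threshold and top_wrong - next_wrong > spread:
        return "distractor pull"
    # Guessing: the most popular options draw roughly equal shares of the group.
    ranked = sorted(percentages.values(), reverse=True)
    if len(ranked) > 1 and ranked[0] - ranked[1] <= spread:
        return "guessing"
    return "no clear pattern"


# Hypothetical response percentages for three items.
print(error_pattern({"a": 10, "b": 5, "c": 45, "d": 40}, correct="c"))
print(error_pattern({"a": 30, "b": 28, "c": 27, "d": 15}, correct="a"))
print(error_pattern({"a": 90, "b": 5, "c": 3, "d": 2}, correct="a"))
```

Checking for a distractor pull first keeps an item with one dominant wrong answer from
being labeled as guessing merely because that answer rivals the correct one.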



Conclusions 

As can be seen from these examples, user-friendly, high-level computer software can
lessen the tedium of appropriate data analysis. The NHEIAP Interpreter program is
transparent for most users. Its interface is simple and straightforward, so the “computer
literacy” demands are virtually nonexistent. The program allows educators to focus on
the meaning of the data rather than grappling with the computer to yield meaningful
data. It is provided at no cost to NH school districts, while many commercial programs
are very expensive and oriented toward the needs and expertise of measurement
professionals. These more sophisticated programs do not necessarily provide, as the Data
Interpreter does, the information that many districts need to examine their curriculum
against the data the statewide assessment provides. Use of the program in New
Hampshire has been very successful. Many districts have used the Interpreter to identify
strengths and weaknesses in their own instructional programs. Because financial resources
differ radically among New Hampshire districts, where 90% of school funding is
based on local property taxes, poorer districts have little to no capacity to
fund such a state-of-the-art computer program themselves. The ease of use has also been a
boon to these districts, where there is not much of a technology base in either hardware or
expertise.

Finally, the Interpreter illustrates how very simple, intuitively obvious data analysis can
be extremely powerful for educators seeking to analyze their local curriculum.
Classroom teachers can examine the likely skills of a group dominated by one or two
score levels; principals can examine the differences between cohorts within their
buildings; and district administrators can focus staff development and curriculum
alignment efforts more precisely by use of the Interpreter.

Copies of the CD-ROM and documentation are available for $5.00 by contacting the 
authors of this paper. 



Nancy R. Cook, Ph.D. 

Robert A Smith, Ph.D. 
TRIERE Research 
219 South Road 
Deerfield, NH 03037 
Phone: 603-463-7017 / FAX: 463-5952
E-mail: nrc@nancy.mv.com 

Answer open-response questions 18 and 19. Be sure to [...]
work in your head, explain how you did your work with words or equations [...]
label them appropriately. (Some [...] have more [...].)
Circle your final answers to set them off from other [...].

Curriculum Goal 4

Use the grid below to answer question 18.

State Average Score: 1.1

18. a. If the shaded square has an area of one square unit, what is the area of triangle ABC?

    b. In your Student Response Booklet, draw another triangle that is not congruent to
       triangle ABC, but has the same area. Explain how you know the areas are the same.

    c. The hypotenuse of the isosceles right triangle below has a length of 4 units. What is
       the area of that triangle? Show your work.